SEQUENTIAL DECISION MAKING WITH LIMITED RESOURCES

Author: Sankararaman Karthik Abinav
Publication date: 01/01/2019

One of the goals of Artificial Intelligence (AI) is to enable multiple agents to interact, coordinate and compete with each other to realize various goals. Typically, this is achieved via a system that acts as a mediator and controls the agents' behavior via incentives. Such systems are ubiquitous and include online systems for shopping (e.g., Amazon), ride-sharing (e.g., Uber, Lyft) and Internet labor markets (e.g., Mechanical Turk). The main algorithmic challenge in such systems is to ensure that they can operate under a variety of informational constraints, such as uncertainty in the input, committing to actions based on partial information, or being unaffected by noisy input. The mathematical frameworks used to study such systems are broadly called sequential decision making problems, in which the algorithm does not receive the entire input at once; it obtains parts of the input by interacting with the environment (these interactions are also called "actions"). In this thesis, we answer the question: under what informational constraints can we design efficient algorithms for sequential decision making problems? The first part of the thesis deals with the Online Matching problem. Here, the algorithm deals with two prominent constraints: uncertainty in the input and choice of actions being restricted by a combinatorial constraint. We design several new algorithms for many variants of this problem and provide provable guarantees. We also show their efficacy on the ride-share application using a real-world dataset. In the second part of the thesis, we consider the Multi-armed bandit problem with additional informational constraints. In this setting, the algorithm does not receive the entire input and needs to make decisions based on partial observations. Additionally, the set of possible actions is controlled by global resource constraints that bind across time. We design new algorithms for multiple variants of this problem that are worst-case optimal. We provide a general reduction framework to the classic multi-armed bandits problem without any constraints. We complement some of the results with preliminary numerical experiments.

Digital Repository at the University of Maryland

Algorithms to Approximate Column-Sparse Packing Problems

Authors: Brubach Brian; Sankararaman Karthik Abinav; Srinivasan Aravind; Xu Pan
Publication date: 05/08/2019

Column-sparse packing problems arise in several contexts in both deterministic and stochastic discrete optimization. We present two unifying ideas, (non-uniform) attenuation and multiple-chance algorithms, to obtain improved approximation algorithms for some well-known families of such problems. As three main examples, we attain the integrality gap, up to lower-order terms, for known LP relaxations for k-column sparse packing integer programs (Bansal et al., Theory of Computing, 2012) and stochastic k-set packing (Bansal et al., Algorithmica, 2012), and go "half the remaining distance" to optimal for a major integrality-gap conjecture of Füredi, Kahn and Seymour on hypergraph matching (Combinatorica, 1993).
Comment: Extended abstract appeared in SODA 2018. Full version in ACM Transactions on Algorithms.

arXiv.org e-Print Archive

Crossref

Balancing Relevance and Diversity in Online Bipartite Matching via Submodularity

Authors: Dickerson John P.; Sankararaman Karthik Abinav; Srinivasan Aravind; Xu Pan
Publication date: 12/11/2018

In bipartite matching problems, vertices on one side of a bipartite graph are paired with those on the other. In the online variant, one side of the graph is available offline, while the vertices on the other side arrive online. When a vertex arrives, the algorithm must make an immediate and irrevocable decision: either match it to an available vertex or drop it. Examples of such problems include matching workers to firms, advertisers to keywords, organs to patients, and so on. Much of the literature focuses on maximizing the total relevance, modeled via total weight, of the matching. However, in many real-world problems, it is also important to consider contributions of diversity: hiring a diverse pool of candidates, displaying a relevant but diverse set of ads, and so on. In this paper, we propose the Online Submodular Bipartite Matching (OSBM) problem, where the goal is to maximize a submodular function f over the set of matched edges. This objective is general enough to capture the notion of both diversity (e.g., a weighted coverage function) and relevance (e.g., the traditional linear function), as well as many other natural objective functions occurring in practice (e.g., limited total budget in advertising settings). We propose novel algorithms that have provable guarantees and are essentially optimal when restricted to various special cases. We also run experiments on real-world and synthetic datasets to validate our algorithms.
Comment: To appear in AAAI 201
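To make the objective in this abstract concrete, the following is a minimal sketch of a greedy baseline for online bipartite matching under a weighted coverage function. It is not the algorithm proposed in the paper; the function name `greedy_osbm`, the `edge_items` mapping, and the toy data are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Hashable, List, Set, Tuple

Edge = Tuple[str, str]  # (offline vertex, online vertex)


def greedy_osbm(
    arrivals: List[Tuple[str, List[str]]],
    edge_items: Dict[Edge, Set[Hashable]],
    weights: Dict[Hashable, float],
) -> Tuple[List[Edge], float]:
    """Greedy baseline (an assumption, not the paper's method).

    arrivals:   online vertices in arrival order, each with its offline neighbors.
    edge_items: items "covered" by each edge; the objective is the total weight
                of items covered by at least one matched edge (weighted coverage,
                which is submodular).
    """
    matched_offline: Set[str] = set()
    covered: Set[Hashable] = set()
    matching: List[Edge] = []
    for v, neighbors in arrivals:
        best_u, best_gain = None, 0.0
        for u in neighbors:
            if u in matched_offline:
                continue  # each offline vertex can be matched at most once
            # Marginal gain: weight of items this edge would newly cover.
            gain = sum(weights[x] for x in edge_items[(u, v)] - covered)
            if gain > best_gain:
                best_u, best_gain = u, gain
        if best_u is not None:  # irrevocable decision; otherwise v is dropped
            matched_offline.add(best_u)
            covered |= edge_items[(best_u, v)]
            matching.append((best_u, v))
    return matching, sum(weights[x] for x in covered)


# Toy instance (hypothetical data).
weights = {"a": 3, "b": 2, "c": 1}
edge_items = {
    ("u1", "v1"): {"a"},
    ("u2", "v1"): {"b"},
    ("u1", "v2"): {"a", "c"},
    ("u2", "v2"): {"a", "b"},
}
arrivals = [("v1", ["u1", "u2"]), ("v2", ["u1", "u2"])]
matching, value = greedy_osbm(arrivals, edge_items, weights)
```

On this toy instance the second arrival gains nothing from re-covering item "a", so the marginal gain of an edge depends on what is already matched. That dependence is what distinguishes the submodular objective from a linear (total weight) one.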

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Contextual Bandits with Packing and Covering Constraints: A Modular Lagrangian Approach via Regression

Authors: Foster Dylan J.; Sankararaman Karthik Abinav; Slivkins Aleksandrs
Publication date: 24/03/2023

We consider contextual bandits with linear constraints (CBwLC), a variant of contextual bandits in which the algorithm consumes multiple resources subject to linear constraints on total consumption. This problem generalizes contextual bandits with knapsacks (CBwK), allowing for packing and covering constraints, as well as positive and negative resource consumption. We provide the first algorithm for CBwLC (or CBwK) that is based on regression oracles. The algorithm is simple, computationally efficient, and admits vanishing regret. It is statistically optimal for the variant of CBwK in which the algorithm must stop once some constraint is violated. Further, we provide the first vanishing-regret guarantees for CBwLC (or CBwK) that extend beyond the stochastic environment. We side-step strong impossibility results from prior work by identifying a weaker (and, arguably, fairer) benchmark to compare against. Our algorithm builds on LagrangeBwK (Immorlica et al., FOCS 2019), a Lagrangian-based technique for CBwK, and SquareCB (Foster and Rakhlin, ICML 2020), a regression-based technique for contextual bandits. Our analysis leverages the inherent modularity of both techniques.

arXiv.org e-Print Archive

New Algorithms, Better Bounds, and a Novel Model for Online Stochastic Matching

Authors: Brubach Brian; Sankararaman Karthik Abinav; Srinivasan Aravind; Xu Pan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 24th Annual European Symposium on Algorithms (ESA 2016)
Publication date: 01/01/2016

Online matching has received significant attention over the last 15 years due to its close connection to Internet advertising. As the seminal work of Karp, Vazirani, and Vazirani has an optimal (1 - 1/e) competitive ratio in the standard adversarial online model, much effort has gone into developing useful online models that incorporate some stochasticity in the arrival process. One such popular model is the "known I.I.D. model", where different customer types arrive online from a known distribution. We develop algorithms with improved competitive ratios for some basic variants of this model with integral arrival rates, including: (a) the case of general weighted edges, where we improve the best-known ratio of 0.667 due to [Haeupler, Mirrokni and Zadimoghaddam WINE 2011] to 0.705; and (b) the vertex-weighted case, where we improve the 0.7250 ratio of [Jaillet and Lu Math. Oper. Res 2013] to 0.7299. We also consider two extensions: one is "known I.I.D." with non-integral arrival rate and stochastic rewards; the other is "known I.I.D." b-matching with non-integral arrival rate and stochastic rewards. We present a simple non-adaptive algorithm which works well simultaneously on the two extensions. One of the key ingredients of our improvement is the following (offline) approach to bipartite-matching polytopes with additional constraints. We first add several valid constraints in order to get a good fractional solution f; however, these give us less control over the structure of f. We next remove all these additional constraints and randomly move from f to a feasible point on the matching polytope with all coordinates being from the set {0, 1/k, 2/k, ..., 1} for a chosen integer k. The structure of this solution is inspired by [Jaillet and Lu Math. Oper. Res 2013] and is a tractable structure for algorithm design and analysis. The appropriate random move preserves many of the removed constraints (approximately with high probability, and exactly in expectation). This underlies some of our improvements, and, we hope, could be of independent interest.
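For context on the adversarial baseline this abstract cites, the following is a minimal sketch of the classical Ranking algorithm of Karp, Vazirani, and Vazirani for unweighted online bipartite matching. It is not one of the new algorithms for the known I.I.D. model described above; the function name, the input layout, and the toy graph are hypothetical.

```python
import random
from typing import Dict, List, Optional, Tuple


def ranking(
    offline: List[str],
    arrivals: List[Tuple[str, List[str]]],
    seed: Optional[int] = None,
) -> Dict[str, str]:
    """Ranking: fix one random order of the offline vertices up front, then
    match each arriving online vertex to its best-ranked free neighbor.

    Returns a dict mapping matched offline vertex -> online vertex.
    """
    rng = random.Random(seed)
    order = list(offline)
    rng.shuffle(order)  # a single random permutation, drawn before any arrival
    rank = {u: i for i, u in enumerate(order)}

    matched: Dict[str, str] = {}
    for v, neighbors in arrivals:
        free = [u for u in neighbors if u not in matched]
        if free:
            # Irrevocably match v to the free neighbor with the smallest rank.
            matched[min(free, key=rank.__getitem__)] = v
        # Otherwise v is dropped; the decision cannot be revisited later.
    return matched


# Toy instance (hypothetical data).
offline = ["u1", "u2", "u3"]
arrivals = [("v1", ["u1", "u2"]), ("v2", ["u1"]), ("v3", ["u2", "u3"])]
result = ranking(offline, arrivals, seed=0)
```

The permutation is drawn once rather than per arrival; that shared randomness is the feature of Ranking behind the (1 - 1/e) guarantee in the adversarial model.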

Dagstuhl Research Online Publication Server

Adversarial Bandits with Knapsacks

Authors: Immorlica Nicole; Sankararaman Karthik Abinav; Schapire Robert; Slivkins Aleksandrs
Publication date: 13/10/2019

We consider Bandits with Knapsacks (henceforth, BwK), a general model for multi-armed bandits under supply/budget constraints. In particular, a bandit algorithm needs to solve a well-known knapsack problem: find an optimal packing of items into a limited-size knapsack. The BwK problem is a common generalization of numerous motivating examples, which range from dynamic pricing to repeated auctions to dynamic ad allocation to network routing and scheduling. While the prior work on BwK focused on the stochastic version, we pioneer the other extreme in which the outcomes can be chosen adversarially. This is a considerably harder problem, compared to both the stochastic version and the "classic" adversarial bandits, in that regret minimization is no longer feasible. Instead, the objective is to minimize the competitive ratio: the ratio of the benchmark reward to the algorithm's reward. We design an algorithm with competitive ratio O(log T) relative to the best fixed distribution over actions, where T is the time horizon; we also prove a matching lower bound. The key conceptual contribution is a new perspective on the stochastic version of the problem. We suggest a new algorithm for the stochastic version, which builds on the framework of regret minimization in repeated games and admits a substantially simpler analysis compared to prior work. We then analyze this algorithm for the adversarial version and use it as a subroutine to solve the latter.
Comment: Extended abstract appeared in FOCS 2019
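The objective described in this abstract can be illustrated with a small simulation. This is a sketch under simplifying assumptions, not the paper's algorithm: there is a single resource, outcomes are a fixed table rather than adversarially chosen, play stops as soon as the chosen arm's consumption exceeds the remaining budget, and the benchmark is a single fixed arm (a degenerate fixed distribution). All names and numbers are hypothetical.

```python
from typing import Callable, List, Tuple

Outcome = Tuple[float, float]  # (reward, resource consumption)


def run_bwk(
    outcomes: List[List[Outcome]],
    budget: float,
    choose_arm: Callable[[int], int],
) -> float:
    """Play rounds in order; stop once the chosen arm no longer fits the budget."""
    remaining = budget
    total_reward = 0.0
    for t, round_outcomes in enumerate(outcomes):
        reward, cost = round_outcomes[choose_arm(t)]
        if cost > remaining:
            break  # budget exhausted: the process ends here
        remaining -= cost
        total_reward += reward
    return total_reward


def competitive_ratio(benchmark_reward: float, algorithm_reward: float) -> float:
    """Ratio of benchmark reward to algorithm reward (lower is better, at least 1
    when the benchmark is no worse than the algorithm)."""
    if algorithm_reward <= 0:
        return float("inf")
    return benchmark_reward / algorithm_reward


# Toy instance: T = 4 rounds, 2 arms, budget 2; each pull consumes 1 unit.
outcomes = [[(1.0, 1.0), (0.2, 1.0)] for _ in range(4)]
benchmark = run_bwk(outcomes, budget=2.0, choose_arm=lambda t: 0)        # always arm 0
algorithm = run_bwk(outcomes, budget=2.0, choose_arm=lambda t: (t + 1) % 2)  # arm 1, then arm 0, ...
ratio = competitive_ratio(benchmark, algorithm)
```

Both policies exhaust the budget after two rounds, so the comparison is over what each earned before stopping. The budget-induced stopping time is what separates this model from a standard bandit problem.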

arXiv.org e-Print Archive